Backing Up with the dump Utility (part 2) - What a dump Backup Looks Like

1/30/2012 11:31:57 AM

3. What a dump Backup Looks Like

This section explains one primary difference between dump and its cousins, tar and cpio. dump writes a table of contents at the beginning of each volume while tar and cpio do not.

3.1. dump records an index on the volume

The index is read during an interactive restore, allowing you to run commands such as cd and ls on this table of contents to view and select the files that you want to restore. This interactive restore feature is one of restore’s biggest advantages over tar and cpio. Note one important thing about this index: dump builds it at the beginning of the backup, before it has tried to back up anything. The presence of the index makes the interactive restore efficient because you don’t have to read the whole volume before you can see what’s on it. However, because the index is created before the backup data is written, possibly minutes or hours before, files created during the backup are not included in it, and files deleted during the backup are listed in the index but are not actually on the volume.

3.2. Using the index to create a table of contents

You can create a table of contents of a dump volume by reading the index that dump writes, which shows what dump intended to write to the volume. Keep in mind that reading the index in no way guarantees the integrity of the actual files on the volume, any more than an ls -l on a file in a directory verifies its integrity. You may be wondering why this discussion is included here, in the section about dump; it is because making this table of contents should be a part of every dump backup that you take. Having said that, how do you create a table of contents of a dump file? First, what does “dump file” really mean? Perhaps an illustration would help; see Figure 3.

Figure 3. The format of a dump tape

A volume created by dump may have multiple dump files, sometimes called partitions, on it. Each file ends in an end-of-file (EOF) mark, symbolized in Figure 3 by shaded areas.

You have two options if you want to obtain a table of contents for dump file 3 in Figure 3:

You can tell restore to read the third file on the tape using the s option; this causes restore to skip files 1 and 2 and read file 3. (This option does not apply to disk-based dump backups.)
You can manually position the tape (using mt or tpctl) so that it is sitting at the beginning of that file, then tell restore to read it as if it were the first file on the tape.

You must know the blocking factor with which the volume was written. If you are not sure, try the default by not specifying a blocking factor.

The first method is the easiest, because it involves only one step. The syntax of the command is as follows:

$ restore tsbfy file blocking-factor device

To read the third dump file on the tape with a blocking factor of 32, use the following command:

$ restore tsbfy 3 32 /dev/rmt/0cbn

Here’s a list of the options used and what they do:

The t option tells restore to read the volume index and provide a table of contents.
The s option, and its accompanying argument 3, tells restore to read the third dump file on a tape.
The b option, and its accompanying argument 32, tells restore that you used a blocking factor of 32 when you wrote this dump file.
The f option, and its accompanying argument /dev/rmt/0cbn, specifies that the dump file is on that device.
The y option tells restore to continue in the case of errors, instead of asking you if you want to continue.
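
Because restore takes its options as a single string of key letters, the arguments must follow in the same order as the letters that need them. The following is a minimal sketch of that rule, not part of restore itself: restore_toc_cmd is a hypothetical helper that only prints the command it would run, and it leaves out the s or b letter when the matching value is empty.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a restore command that lists
# the table of contents of a dump file.
#   $1 = tape device
#   $2 = dump file number on the tape (empty to omit the s option)
#   $3 = blocking factor (empty to let restore use its default)
restore_toc_cmd() {
    device=$1
    file=$2
    bf=$3
    keys=t
    args=
    # Each key letter that takes a value adds its value in the same order.
    if [ -n "$file" ]; then keys="${keys}s"; args="$args $file"; fi
    if [ -n "$bf" ];   then keys="${keys}b"; args="$args $bf";   fi
    keys="${keys}fy"
    echo "restore $keys$args $device"
}

restore_toc_cmd /dev/rmt/0cbn 3 32   # prints: restore tsbfy 3 32 /dev/rmt/0cbn
restore_toc_cmd /dev/rmt/0cbn 3 ""   # prints: restore tsfy 3 /dev/rmt/0cbn
```

Printing the command first makes it easy to check that the letters and their arguments line up before you run it against a real tape.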

If you do choose to manually manipulate the tape, as in the second option, you need to be familiar with your Unix version’s magnetic tape command. This is usually mt. Five of its commands are covered here (status, rewind, offline, fsf, and fsr), four of which you might use when manipulating dump tapes. The format of the command is:

$ mt -t device argument

If you are planning to position the tape, make sure you are using a nonrewinding device, such as /dev/rmt/0n. Otherwise, it rewinds as soon as you finish positioning it!

Some versions of mt use a -f instead of a -t. The device argument is the no-rewind tape device that you are using, such as /dev/rmt/0n. Now specify one of the following for argument:

status: This gives you the ioctl status of the tape device. It does not require an accompanying argument.
rewind: This rewinds the tape to the beginning. This option is spelled rew on some versions of Unix. It does not require an accompanying argument.
offline: This ejects the tape from the tape drive. This option is spelled offl on some versions of Unix. It does not require an accompanying argument.
fsf x: This is short for “forward space file.” It positions the tape forward x file marks, where x is a number greater than 0. (If you do not specify a value for x, it defaults to 1.) If you are at the beginning of the tape, you are at file 1, so if you want to be at file 3, you need to go forward two files. This requires an fsf 2, as in mt -t device fsf 2.
fsr x: This is short for “forward space record,” and is not needed when manipulating dump tapes. (If you do not specify a value for x, it defaults to 1.)
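
The off-by-one in fsf (file 3 needs fsf 2) is easy to get wrong, so here is a small sketch of the arithmetic. position_cmds is a hypothetical helper that prints the mt commands rather than running them; it assumes the -t form of mt and a no-rewind device name such as /dev/rmt/0n, both of which may differ on your system.

```shell
#!/bin/sh
# Hypothetical helper: print the mt commands that position a tape at the
# start of dump file N, counting from the beginning of the tape.
#   $1 = no-rewind tape device, $2 = dump file number (1 or greater)
position_cmds() {
    device=$1
    n=$2
    case $n in
        ''|*[!0-9]*|0)
            echo "file number must be a positive integer" >&2
            return 1 ;;
    esac
    echo "mt -t $device rewind"
    # File 1 starts at the beginning of the tape, so it needs no fsf;
    # file N needs N-1 file marks skipped.
    if [ "$n" -gt 1 ]; then
        echo "mt -t $device fsf $(( n - 1 ))"
    fi
}

position_cmds /dev/rmt/0n 3
# prints:
#   mt -t /dev/rmt/0n rewind
#   mt -t /dev/rmt/0n fsf 2
```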

The following are examples of how to use the mt command. To rewind the tape /dev/rmt/0cbn, issue the command:

# mt -t /dev/rmt/0cbn rewind

To fast-forward the tape /dev/rmt/0cbn to the second file on the tape, issue the command:

# mt -t /dev/rmt/0cbn fsf 1

To eject the tape /dev/rmt/0cbn, issue the command:

# mt -t /dev/rmt/0cbn offline

To get the status of the tape /dev/rmt/0cbn, issue the command:

# mt -t /dev/rmt/0cbn status

Once you have positioned the tape to the proper file, simply use the same restore command as before, leaving off the s option and its argument:

$ restore tbfy 32 /dev/rmt/0cbn

Whichever method you use, the table of contents is sent to standard output, which you should redirect into a file. One important thing to note about this output is that the name of the filesystem dumped to this volume is not in the output. This table of contents is relative to that filesystem, whatever its name was. For example, if you backed up /var, and you were looking for /var/adm/messages, the output would look something like this:

345353  ./adm/messages
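
If you prefer full pathnames in a saved table of contents, you can put the filesystem name back yourself. This is a sketch under stated assumptions: toc_to_absolute is a hypothetical filter, and it assumes each line is an inode number, some whitespace, and a path beginning with ./ as in the sample above.

```shell
#!/bin/sh
# Hypothetical filter: rewrite "./path" entries in a restore table of
# contents as full paths under the given mount point.
#   $1 = mount point of the dumped filesystem, e.g. /var
# Assumes the mount point contains no "|" or "&" characters, since it is
# pasted into a sed replacement.
toc_to_absolute() {
    mount=${1%/}   # "/var" stays "/var"; "/" becomes "" so paths get one slash
    sed "s|^\([[:space:]]*[0-9][0-9]*[[:space:]]\{1,\}\)\./|\1${mount}/|"
}

printf '345353  ./adm/messages\n' | toc_to_absolute /var
# prints: 345353  /var/adm/messages
```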

I recommend that you create a table of contents for each dump volume when you make it and store this output in a file that matches the name of the volume. Obviously, you should use a unique name, like:

./dump.system.filesystem.level0.May19.2006
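
One way to keep such names consistent is to generate them. The sketch below is only an illustration of the naming pattern above: toc_name is a hypothetical helper, apollo is a made-up system name, and turning slashes into underscores (and calling / "root") is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: build a table-of-contents file name of the form
# dump.system.filesystem.levelN.date
#   $1 = system name, $2 = filesystem, $3 = dump level,
#   $4 = date stamp (optional; defaults to today, e.g. May19.2006)
toc_name() {
    system=$1
    filesystem=$2
    level=$3
    stamp=${4:-$(date +%b%d.%Y)}
    # /var -> var, /export/home -> export_home, / -> root
    fs=$(printf '%s' "$filesystem" | sed -e 's|^/||' -e 's|/|_|g')
    [ -n "$fs" ] || fs=root
    echo "dump.$system.$fs.level$level.$stamp"
}

toc_name apollo /var 0 May19.2006
# prints: dump.apollo.var.level0.May19.2006

# Typical use (not run here): redirect the restore listing into that file.
#   restore tsbfy 3 32 /dev/rmt/0cbn > "./$(toc_name apollo /var 0)"
```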

Saving tables of contents in this way is very handy when you’re searching for a file and you can’t seem to find it on any volume. A quick grep of all the saved table-of-contents files shows you which volume you need.
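
As a rough sketch of that search, assuming the tables of contents are kept in one directory under names that start with dump. as suggested above: find_volumes is a hypothetical helper, and the sample files are created in a temporary directory purely for demonstration.

```shell
#!/bin/sh
# Hypothetical helper: list the saved tables of contents that mention a path.
#   $1 = directory holding the saved files, $2 = path fragment to look for
find_volumes() {
    # -l prints only the names of matching files; -F treats the pattern
    # as a fixed string, so the dots in "./adm/messages" are literal.
    grep -l -F -- "$2" "$1"/dump.* 2>/dev/null
}

# Demonstration with two made-up tables of contents.
tocdir=$(mktemp -d)
printf '345353  ./adm/messages\n' > "$tocdir/dump.apollo.var.level0.May19.2006"
printf '1201  ./bin/ls\n'         > "$tocdir/dump.apollo.usr.level0.May19.2006"

find_volumes "$tocdir" ./adm/messages   # names only the /var table of contents
rm -rf "$tocdir"
```

Remember that the paths in each file are relative to the filesystem that was dumped, so search for ./adm/messages rather than /var/adm/messages.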

A Day Late and a Dollar Short

I was once told that data needed to be recovered from a machine that had been decommissioned 10 years earlier. I was told the name of the machine and about where the tapes were stored, so I started digging.

Once I found the tapes and scrounged up a tape drive with low enough density to read them, I discovered that they were in a dump format that was no longer supported! I found the source code for the original restore program (the BSD 4.1 one in this case), downloaded it to my machine (SunOS 4.0.1 in this case, a BSD 4.3-like system), and started working on porting the old program. No good. I soon realized it would take me weeks to do it; the filesystem and dump formats had changed that much.

There had to be a different way, so I searched the data vaults for more tapes. Luckily, I found another stack of tapes, marked as being in tar format. I had lucked out! Most of these tapes were still readable, and the data came off the first try.

Moral of the story: when you decommission a machine, make an archival copy of the data in every format you can, on every type of media you can. Some, like dump, are very efficient but might not be supported someday, while others, like tar and cpio, have stayed around year in and year out. Times change, media changes, formats change, so make as many variations as you can so your data will be retrievable for as long as possible.

This made me a big fan of using tar for archival purposes, which makes excellent sense: its name stands for Tape ARchiver, after all.